Simile recognition involves two subtasks: simile sentence classification, which discriminates whether a sentence contains a simile, and simile component extraction, which locates the corresponding objects (i.e., tenors and vehicles). Recent work ignores features other than surface strings. In this paper, we explore expressive features for this task to achieve more effective data utilization. In particular, we study two types of features: 1) input-side features, including POS tags, dependency trees, and word definitions, and 2) decoding features, which capture the interdependence among various decoding decisions. We further construct a model named HGSR, which merges the input-side features as a heterogeneous graph and leverages decoding features via distillation. Experiments show that HGSR significantly outperforms the current state-of-the-art systems and carefully designed baselines, verifying the effectiveness of the introduced features. Our code is available at https://github.com/DeepLearnXMU/HGSR.
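As a rough illustration of the input-side features, the following sketch merges tokens, POS tags, dependency arcs, and word definitions into one heterogeneous graph with networkx; the node and edge type names are our own illustrative choices, not the schema used by HGSR.

```python
# Minimal sketch: merging POS tags, dependency arcs, and word definitions
# into one heterogeneous graph as HGSR-style input. All node and edge type
# names are illustrative assumptions, not the paper's actual schema.
import networkx as nx

sentence = ["Her", "smile", "is", "like", "sunshine"]
pos_tags = ["PRP$", "NN", "VBZ", "IN", "NN"]
dep_arcs = [(4, 1, "nsubj"), (4, 2, "cop"), (4, 3, "case"), (1, 0, "nmod:poss")]  # (head, dep, label)
definitions = {"sunshine": ["direct", "light", "of", "the", "sun"]}

g = nx.MultiDiGraph()
for i, (tok, pos) in enumerate(zip(sentence, pos_tags)):
    g.add_node(("tok", i), text=tok, ntype="token")
    g.add_node(("pos", pos), ntype="pos")
    g.add_edge(("tok", i), ("pos", pos), etype="has_pos")
for head, dep, label in dep_arcs:
    g.add_edge(("tok", head), ("tok", dep), etype="dep", label=label)
for word, gloss in definitions.items():
    idx = sentence.index(word)
    for j, dtok in enumerate(gloss):
        g.add_node(("def", word, j), text=dtok, ntype="def_token")
        g.add_edge(("tok", idx), ("def", word, j), etype="defined_by")

print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```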
In medical image analysis, the subtle visual characteristics of many diseases are challenging to identify, in particular due to the lack of paired data. For example, in mild Alzheimer's disease (AD), brain tissue atrophy is hard to observe from imaging data alone, especially without paired AD and cognitively normal (CN) data for comparison. This work presents Disease Discovery GAN (DiDiGAN), a weakly supervised style-based framework for discovering and visualizing subtle disease features. DiDiGAN learns a disease manifold of AD and CN visual characteristics, and imposes style codes sampled from this manifold onto an anatomical structure "blueprint" to synthesize paired AD and CN magnetic resonance images (MRIs). To suppress non-disease-related variations between the generated AD and CN pairs, DiDiGAN leverages a structural constraint with cycle consistency and anti-aliasing to enforce anatomical correspondence. When tested on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, DiDiGAN revealed key AD characteristics (reduced hippocampal volume, ventricular enlargement, and atrophy of cortical structures) through the synthesized paired AD and CN scans. The qualitative results are supported by automated brain volume analysis, in which systematic pairwise reductions of brain tissue structures were also measured.
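A toy sketch of the paired-synthesis idea described above, under heavy simplifying assumptions: a tiny convolutional stand-in generator, random style codes in place of the learned disease manifold, and an L1 cycle term in place of the full structural constraint.

```python
# Toy sketch: one anatomy "blueprint" plus two style codes (AD / CN) yields
# a paired scan, with a cycle-consistency term tying the pair anatomically.
# Shapes, the generator, and the losses are illustrative assumptions.
import torch
import torch.nn as nn

gen = nn.Sequential(  # stand-in for a style-conditioned generator
    nn.Conv2d(1 + 8, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

def synthesize(blueprint, style):
    # broadcast the style code over the spatial grid and concatenate
    b, _, h, w = blueprint.shape
    style_map = style.view(b, -1, 1, 1).expand(b, style.size(1), h, w)
    return gen(torch.cat([blueprint, style_map], dim=1))

blueprint = torch.randn(4, 1, 64, 64)   # shared anatomical structure
style_ad = torch.randn(4, 8)            # sampled from the AD side of the manifold
style_cn = torch.randn(4, 8)            # sampled from the CN side

ad_img = synthesize(blueprint, style_ad)
cn_img = synthesize(blueprint, style_cn)

# cycle consistency: re-applying the CN style to the AD image (and vice
# versa) should recover the counterpart, suppressing non-disease variation
cycle = (synthesize(ad_img, style_cn) - cn_img).abs().mean() \
      + (synthesize(cn_img, style_ad) - ad_img).abs().mean()
print("cycle-consistency loss:", cycle.item())
```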
Recently, the development of machine-learning (ML) potentials has made it possible to perform large-scale and long-time molecular simulations with the accuracy of quantum-mechanical (QM) models. However, for high-level QM methods, such as density functional theory (DFT) at the meta-GGA level and/or with exact exchange, quantum Monte Carlo, etc., generating a sufficient amount of training data is computationally challenging due to their high cost. In this work, we demonstrate that this issue can be largely alleviated by Deep Kohn-Sham (DeePKS), an ML-based DFT model. DeePKS employs a computationally efficient neural-network-based functional model to construct a correction term added on top of a cheap DFT model. Upon training, DeePKS offers closely matched energies and forces compared with the high-level QM method, yet the amount of training data required is orders of magnitude smaller than that needed to train a reliable ML potential. Consequently, DeePKS can serve as a bridge between expensive QM models and ML potentials: one can generate a decent amount of high-accuracy QM data to train a DeePKS model, and then use the DeePKS model to label a large number of configurations for training an ML potential. This scheme for periodic systems is implemented in the DFT package ABACUS, which is open-source and ready for use in various applications.
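The core correction-term idea can be sketched in a few lines: a neural network learns the residual between cheap baseline energies and high-level labels, so that E_total = E_base + ΔE_NN. Everything below (descriptors, network, data) is a synthetic stand-in, not the actual DeePKS implementation in ABACUS.

```python
# Minimal sketch of the DeePKS idea: a cheap baseline DFT energy plus a
# neural-network correction fitted to a small set of high-level QM labels.
import torch
import torch.nn as nn

correction = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(correction.parameters(), lr=1e-3)

# synthetic "configurations": descriptors, baseline and high-level energies
descriptors = torch.randn(256, 16)              # one descriptor per configuration
e_base = torch.randn(256, 1)                    # cheap-DFT energies
e_high = e_base + 0.1 * descriptors.sum(-1, keepdim=True)  # pretend high-level labels

for step in range(200):
    e_pred = e_base + correction(descriptors)   # E_total = E_base + dE_NN
    loss = ((e_pred - e_high) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("final energy-fit loss:", loss.item())
```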
Graph neural networks (GNNs) are effective tools for graph representation learning. Most GNNs rely on a recursive neighborhood aggregation scheme known as message passing, and hence their theoretical expressive power is limited to the first-order Weisfeiler-Lehman test (1-WL). Motivated by the success of retrieval-based models and off-the-shelf high-performance retrieval systems, we propose a non-parametric and model-agnostic scheme called GraphRetrieval to enhance existing GNN models. In GraphRetrieval, similar training graphs, together with their ground-truth labels, are retrieved as an enhancement to be jointly utilized with the input graph representation for various graph property prediction tasks. In particular, to effectively "absorb" useful information from the retrieved graphs and "ignore" possible noise, we introduce a self-attention-based adapter that explicitly learns the interaction between an input graph and its retrieved similar graphs. By experimenting with three classic GNN models on 12 different datasets, we demonstrate that GraphRetrieval brings substantial improvements to existing GNN models without noticeably increasing model size or degrading prediction efficiency. Our work is also the first to verify the feasibility and effectiveness of retrieval-enhanced graph neural networks.
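A minimal sketch of a self-attention-based adapter of the kind described: the input-graph embedding attends over k retrieved graph embeddings enriched with their labels. The dimensions, the label-injection step, and the readout head are illustrative assumptions.

```python
# Sketch: fuse an input-graph embedding with k retrieved training-graph
# embeddings (plus their ground-truth labels) via a self-attention adapter.
import torch
import torch.nn as nn

d, k = 64, 5
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
label_proj = nn.Linear(1, d)   # inject retrieved labels into their embeddings
readout = nn.Linear(2 * d, 1)  # final property-prediction head

query_graph = torch.randn(8, 1, d)    # batch of input-graph embeddings
retrieved = torch.randn(8, k, d)      # k retrieved graph embeddings each
retrieved_y = torch.randn(8, k, 1)    # their ground-truth labels

memory = retrieved + label_proj(retrieved_y)
fused, weights = attn(query_graph, memory, memory)  # attend over retrievals
pred = readout(torch.cat([query_graph, fused], dim=-1)).squeeze(-1)
print(pred.shape, weights.shape)      # (8, 1) predictions, (8, 1, k) weights
```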
The success of deep learning heavily relies on large-scale data with comprehensive labels, which are more expensive and time-consuming to fetch in 3D than for 2D images or natural languages. This promotes the potential of utilizing models pretrained with data other than 3D as teachers for cross-modal knowledge transfer. In this paper, we revisit masked modeling in a unified fashion of knowledge distillation, and we show that foundational Transformers pretrained with 2D images or natural languages can help self-supervised 3D representation learning through training Autoencoders as Cross-Modal Teachers (ACT). The pretrained Transformers are transferred as cross-modal 3D teachers using discrete variational autoencoding self-supervision, during which the Transformers are frozen with prompt tuning for better knowledge inheritance. The latent features encoded by the 3D teachers are used as the targets of masked point modeling, wherein the dark knowledge is distilled to the 3D Transformer students as foundational geometry understanding. Our ACT-pretrained 3D learner achieves state-of-the-art generalization capacity across various downstream benchmarks, e.g., 88.21% overall accuracy on ScanObjectNN. Code will be released at https://github.com/RunpeiDong/ACT.
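The masked-modeling-as-distillation recipe can be sketched as follows, with toy Transformer encoders standing in for the pretrained foundational teacher and the 3D student; prompt tuning and the discrete variational autoencoding step are omitted for brevity.

```python
# Toy sketch: a frozen cross-modal teacher encodes all point tokens; the 3D
# student sees only unmasked tokens and regresses the teacher's latents at
# the masked positions. Encoders and sizes are illustrative stand-ins.
import torch
import torch.nn as nn

d, n_tok = 32, 16
teacher = nn.TransformerEncoder(nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
student = nn.TransformerEncoder(nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
for p in teacher.parameters():
    p.requires_grad_(False)       # the teacher stays frozen

tokens = torch.randn(4, n_tok, d)                   # point-patch tokens
mask = torch.rand(4, n_tok) < 0.6                   # 60% of tokens masked
mask_embed = nn.Parameter(torch.zeros(d))

student_in = torch.where(mask.unsqueeze(-1), mask_embed.expand_as(tokens), tokens)
with torch.no_grad():
    target = teacher(tokens)                        # latent features as targets
pred = student(student_in)

distill_loss = (pred - target)[mask].pow(2).mean()  # loss only at masked slots
print("masked distillation loss:", distill_loss.item())
```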
Current natural language processing (NLP) models such as BERT and RoBERTa have achieved high overall performance, but they often make systematic errors due to bias or hard-to-learn features. Research on slice detection models (SDMs), which automatically identify underperforming groups of datapoints, has therefore gradually attracted more attention; it aims both at understanding model behaviors and at providing insights for future model training and design. However, there is little systematic research on SDMs or quantitative evaluation of their assessment of NLP models. Our paper fills this gap by proposing a "Discover, Explanation, Improvement" framework that discovers coherent, underperforming groups of datapoints and unites the datapoints of each slice under human-understandable concepts; it also provides comprehensive evaluation tasks and corresponding quantitative metrics, enabling convenient comparison in future work. Results show that our framework can accurately select error-prone datapoints with informative semantic features that summarize error patterns, based on which it directly boosts the performance of trained models by an average of 2.85 points across multiple datasets without tuning any parameters.
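A minimal slice-discovery sketch in the spirit of the Discover step: cluster datapoint embeddings and rank clusters by error rate to surface coherent underperforming groups. The data, clustering method, and threshold below are synthetic stand-ins, not the framework's actual procedure.

```python
# Cluster datapoint embeddings, then flag clusters whose error rate is
# high: a crude stand-in for automatic slice detection.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 32))   # datapoint representations
correct = rng.random(1000) > 0.2           # per-example model correctness
correct[:120] = False                      # plant one error-dense region...
embeddings[:120] += 3.0                    # ...that is also coherent in space

clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(embeddings)
for c in range(10):
    members = clusters == c
    err = 1.0 - correct[members].mean()
    if err > 0.5:                          # flag underperforming slices
        print(f"slice {c}: {members.sum()} examples, error rate {err:.2f}")
```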
Pre-trained language models have made substantial progress on dialogue tasks. However, these models are typically trained on surface dialogue text and have thus been shown to be weak at understanding the main semantic meaning of a dialogue context. We investigate Abstract Meaning Representation (AMR) as explicit semantic knowledge for pre-training models, in order to capture the core semantic information in dialogues during pre-training. In particular, we propose a semantic-based pre-training framework that extends the standard pre-training framework (Devlin et al., 2019) with three tasks based on AMR graph representations. Experiments on chit-chat and task-oriented dialogue understanding demonstrate the superiority of our model. To our knowledge, we are the first to leverage deep semantic representations for dialogue pre-training.
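As a hedged illustration of extending standard pre-training with AMR-derived supervision, the sketch below adds a single auxiliary objective (multi-label prediction of an utterance's AMR concepts) on top of a masked-LM loss; the paper itself uses three AMR-based tasks, and all names and sizes here are our own assumptions.

```python
# Toy joint objective: masked-LM loss plus an AMR-concept prediction loss
# computed from the first token's hidden state.
import torch
import torch.nn as nn

vocab, n_concepts, d = 1000, 200, 64
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
embed = nn.Embedding(vocab, d)
mlm_head = nn.Linear(d, vocab)
concept_head = nn.Linear(d, n_concepts)        # multi-label AMR concept prediction

ids = torch.randint(1, vocab, (8, 20))
mask = torch.rand(8, 20) < 0.15
ids_in = ids.masked_fill(mask, 0)              # id 0 as a stand-in [MASK]
amr_concepts = (torch.rand(8, n_concepts) < 0.05).float()  # gold concept sets

hidden = encoder(embed(ids_in))
loss_mlm = nn.functional.cross_entropy(
    mlm_head(hidden)[mask], ids[mask])         # recover only the masked tokens
loss_amr = nn.functional.binary_cross_entropy_with_logits(
    concept_head(hidden[:, 0]), amr_concepts)  # predict AMR concepts from [CLS]
loss = loss_mlm + loss_amr                     # joint semantic pre-training loss
print(loss.item())
```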
In recent years, blind image quality assessment (BIQA) has achieved great success in various task-specific scenarios that present invariant distortion types and evaluation criteria. However, due to their rigid structures and learning frameworks, these methods cannot be applied to cross-task BIQA scenarios, in which the distortion types and evaluation criteria keep changing in practical applications. This paper proposes a Scalable Incremental Learning Framework (SILF) that can sequentially conduct BIQA across multiple evaluation tasks with limited memory capacity. More specifically, we develop a dynamic parameter isolation strategy to sequentially update task-specific parameter subsets that do not overlap with one another. Each parameter subset is temporarily settled to memorize one evaluation preference for its corresponding task, and previously settled parameter subsets can be adaptively reused in subsequent BIQA tasks to achieve better performance based on task relevance. To suppress the unrestrained expansion of memory capacity in sequential task learning, we develop a scalable memory unit by gradually and selectively pruning unimportant neurons from previously settled parameter subsets, which allows us to forget part of the previous experience and frees limited memory capacity for adapting to emerging new tasks. Extensive experiments on eleven IQA datasets demonstrate that our proposed method significantly outperforms other state-of-the-art methods in cross-task BIQA.
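The dynamic parameter isolation strategy can be sketched as disjoint binary masks over a shared weight tensor, with updates confined to the current task's subset; the random claiming rule and the two-task setup below are illustrative simplifications.

```python
# Each task claims a disjoint binary mask over a shared weight tensor;
# earlier subsets stay frozen while the current subset is updated.
import torch

weight = torch.zeros(64, 64)
free = torch.ones_like(weight, dtype=torch.bool)  # not-yet-claimed entries
task_masks = {}

def claim_subset(task, fraction=0.3):
    """Randomly claim a fraction of the still-free parameters for `task`."""
    scores = torch.rand_like(weight).masked_fill(~free, -1.0)
    k = int(fraction * free.sum())
    idx = scores.flatten().topk(k).indices
    mask = torch.zeros_like(free).flatten()
    mask[idx] = True
    mask = mask.view_as(free)
    task_masks[task] = mask
    free &= ~mask
    return mask

for t in ["task_A", "task_B"]:
    m = claim_subset(t)
    grad = torch.randn_like(weight)  # stand-in gradient for this task
    weight -= 0.1 * grad * m         # update confined to the task's subset

overlap = (task_masks["task_A"] & task_masks["task_B"]).sum().item()
print("overlap between task subsets:", overlap)  # 0 by construction
```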
We develop a deep generative model-based variational free-energy method for the equation of state of dense hydrogen. We employ a normalizing flow network to model the proton Boltzmann distribution and a fermionic neural network to model the electron wavefunction at given proton positions. By jointly optimizing the two neural networks, we reach variational free energies comparable to previous coupled electron-ion Monte Carlo calculations. Our results suggest that hydrogen under planetary conditions is even denser than indicated by previous Monte Carlo and ab initio molecular dynamics data, and far from the predictions of empirical chemical models. Obtaining a reliable equation of state for dense hydrogen, and in particular direct access to entropy and free energy, opens new opportunities in planetary modeling and high-pressure physics research.
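The variational objective has a compact form: minimize F = E_{x~p}[U(x) + kT log p(x)] over a normalizing flow p. The sketch below does this for a 1D toy potential standing in for the fermionic-network electronic energy; the "flow" here is just a learnable affine transform, an assumption for brevity.

```python
# Toy variational free-energy estimator: train a flow p(x) to minimize
# F = E_x~p[ U(x) + kT * log p(x) ], the classical variational bound.
import torch
import torch.distributions as D

kT = 1.0
loc = torch.zeros(1, requires_grad=True)
log_scale = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([loc, log_scale], lr=0.05)

def potential(x):                  # toy stand-in for the electronic energy
    return 0.5 * ((x - 2.0) ** 2).sum(-1)

for step in range(300):
    flow = D.TransformedDistribution(
        D.Normal(torch.zeros(1), torch.ones(1)),
        [D.AffineTransform(loc, log_scale.exp())],
    )
    x = flow.rsample((512,))       # reparameterized samples keep gradients
    f = (potential(x) + kT * flow.log_prob(x).sum(-1)).mean()
    opt.zero_grad(); f.backward(); opt.step()

print("variational free energy estimate:", f.item())
```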
Machine-learning-assisted modeling of the interatomic potential energy surface (PES) is revolutionizing the field of molecular simulation. With the accumulation of high-quality electronic structure data, a model that can be pretrained on all available data and finetuned on downstream tasks with a small additional effort would bring the field to a new stage. Here we propose DPA-1, a Deep Potential model with a novel attention mechanism, which is highly effective at representing the conformational and chemical spaces of atomic systems and learning the PES. We tested DPA-1 on a number of systems and observed superior performance compared with existing benchmarks. When pretrained on a large-scale dataset containing 56 elements, DPA-1 can be successfully applied to various downstream tasks with greatly improved sample efficiency. Surprisingly, the learned type-embedding parameters of the different elements form a $spiral$ in latent space, with a natural correspondence to their positions in the periodic table, demonstrating interesting interpretability of the pretrained DPA-1 model.
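A rough sketch of the attention flavor described: learned type embeddings plus simple geometric features of each atom's neighbors, fused by attention into per-atom descriptors. The feature construction below is a simplified stand-in for the actual smooth environment descriptors, and all sizes are assumptions.

```python
# Per-atom descriptors via attention over neighbor type + geometry features.
import torch
import torch.nn as nn

n_types, d, n_atoms, n_neigh = 56, 32, 10, 6
type_embed = nn.Embedding(n_types, d)     # learned per-element codes
attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
geo_proj = nn.Linear(4, d)                # encode (1/r, unit vector)

types = torch.randint(0, n_types, (1, n_atoms))
neigh_idx = torch.randint(0, n_atoms, (1, n_atoms, n_neigh))
rel = torch.randn(1, n_atoms, n_neigh, 3)  # neighbor displacement vectors

r = rel.norm(dim=-1, keepdim=True).clamp_min(1e-6)
geo = geo_proj(torch.cat([1.0 / r, rel / r], dim=-1))  # geometric features
neigh_t = type_embed(types.gather(1, neigh_idx.flatten(1)).view(1, n_atoms, n_neigh))
env = (geo + neigh_t).flatten(0, 1)       # (n_atoms, n_neigh, d) environments

query = type_embed(types).flatten(0, 1).unsqueeze(1)  # one query per atom
desc, _ = attn(query, env, env)           # attention over each atom's neighbors
print(desc.shape)                         # (n_atoms, 1, d) descriptors
```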